AITopics | Frisco

6d0f9c415e2d779c78f32b74668e9d02-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing SystemsFeb-15-2026, 16:17:20 GMT

Fact-checking is extensively studied in the context of misinformation and disinformation, addressing objective inaccuracies. However, a softer form of misinformation involves responses that are factually correct but lack certain features such as clarity and relevance. This challenge is prevalent in formal Question-Answer (QA) settings such as press conferences in finance, politics, sports, and other domains, where subjective answers can obscure transparency. Despite this, there is a lack of manually annotated datasets for subjective features across multiple dimensions. To address this gap, we introduce SubjECTive-QA, a human annotated dataset on Earnings Call Transcripts' (ECTs) QA sessions as the answers given by company representatives are often open to subjective interpretations and scrutiny. The dataset includes 49, 446 annotations for long-form QA pairs across six features: Assertive, Cautious, Optimistic, Specific, Clear, and Relevant . These features are carefully selected to encompass the key attributes that reflect the tone of the answers provided during QA sessions across different domains. Our findings are that the best-performing Pre-trained Language Model (PLM), RoBERTa-base, has similar weighted F1 scores to Llama-3-70b-Chat on features with lower subjectivity, such as Relevant and Clear, with a mean difference of 2 .

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.05)
Asia > India > Maharashtra > Mumbai (0.05)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
(15 more...)

Genre:

Financial News (1.00)
Research Report > New Finding (0.87)

Industry:

Media > News (1.00)
Law (1.00)
Banking & Finance > Trading (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

6d0f9c415e2d779c78f32b74668e9d02-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing SystemsOct-10-2025, 05:23:42 GMT

Fact-checking is extensively studied in the context of misinformation and disinformation, addressing objective inaccuracies. However, a softer form of misinformation involves responses that are factually correct but lack certain features such as clarity and relevance. This challenge is prevalent in formal Question-Answer (QA) settings such as press conferences in finance, politics, sports, and other domains, where subjective answers can obscure transparency. Despite this, there is a lack of manually annotated datasets for subjective features across multiple dimensions. To address this gap, we introduce SubjECTive-QA, a human annotated dataset on Earnings Call Transcripts' (ECTs) QA sessions as the answers given by company representatives are often open to subjective interpretations and scrutiny. The dataset includes 49, 446 annotations for long-form QA pairs across six features: Assertive, Cautious, Optimistic, Specific, Clear, and Relevant . These features are carefully selected to encompass the key attributes that reflect the tone of the answers provided during QA sessions across different domains. Our findings are that the best-performing Pre-trained Language Model (PLM), RoBERTa-base, has similar weighted F1 scores to Llama-3-70b-Chat on features with lower subjectivity, such as Relevant and Clear, with a mean difference of 2 .

annotation, computational linguistic, dataset, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.05)
Asia > India > Maharashtra > Mumbai (0.05)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
(15 more...)

Genre:

Financial News (1.00)
Research Report > New Finding (0.87)

Industry:

Media > News (1.00)
Law (1.00)
Banking & Finance > Trading (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Comparative Analysis of FOLD-SE vs. FOLD-R++ in Binary Classification and XGBoost in Multi-Category Classification

Murthy, Akshay, Sebastian, Shawn, Shangle, Manil, Wang, Huaduo, Dasgupta, Sopam, Gupta, Gopal

arXiv.org Artificial IntelligenceSep-24-2025

Recently, the demand for Machine Learning (ML) models that can balance accuracy, efficiency, and interpreability has grown significantly. Traditionally, there has been a tradeoff between accuracy and explainability in predictive models, with models such as Neural Networks achieving high accuracy on complex datasets while sacrificing internal transparency. As such, new rule-based algorithms such as FOLD-SE have been developed that provide tangible justification for predictions in the form of interpretable rule sets. The primary objective of this study was to compare FOLD-SE and FOLD-R++, both rule-based classifiers, in binary classification and evaluate how FOLD-SE performs against XGBoost, a widely used ensemble classifier, when applied to multi-category classification. We hypothesized that because FOLD-SE can generate a condensed rule set in a more explainable manner, it would lose upwards of an average of 3 percent in accuracy and F1 score when compared with XGBoost and FOLD-R++ in multiclass and binary classification, respectively. The research used data collections for classification, with accuracy, F1 scores, and processing time as the primary performance measures. Outcomes show that FOLD-SE is superior to FOLD-R++ in terms of binary classification by offering fewer rules but losing a minor percentage of accuracy and efficiency in processing time; in tasks that involve multi-category classifications, FOLD-SE is more precise and far more efficient compared to XGBoost, in addition to generating a comprehensible rule set. The results point out that FOLD-SE is a better choice for both binary tasks and classifications with multiple categories. Therefore, these results demonstrate that rule-based approaches like FOLD-SE can bridge the gap between explainability and performance, highlighting their potential as viable alternatives to black-box models in diverse classification tasks.

artificial intelligence, fold-se, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2509.18139

Country:

North America > United States > Wisconsin (0.05)
North America > United States > Texas > Dallas County > Dallas (0.04)
North America > United States > Texas > Collin County > Frisco (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine > Therapeutic Area (0.50)
Banking & Finance (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

DoorDash plans to test drone deliveries in San Francisco warehouse

Los Angeles TimesSep-10-2025, 20:31:24 GMT

Things to Do in L.A. Tap to enable a layout that focuses on the article. Masslie Arias, of DoorDash, prepares to load a delivery package on a hovering drone on July 31 in Frisco, Texas. This is read by an automated voice. Please report any issues or inconsistencies here . Food delivery app DoorDash is setting its sights on a new destination to test out flying drone deliveries: San Francisco.

delivery, doordash, drone delivery, (11 more...)

Los Angeles Times

Country:

North America > United States > California > San Francisco County > San Francisco (0.65)
North America > United States > Texas > Collin County > Frisco (0.25)
North America > United States > California > Los Angeles County > Los Angeles (0.08)
(12 more...)

Industry:

Transportation > Air (1.00)
Information Technology > Services (1.00)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis (1.00)
Government > Regional Government > North America Government > United States Government (0.96)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)

Add feedback

Prosody as a Teaching Signal for Agent Learning: Exploratory Studies and Algorithmic Implications

Knierim, Matilda, Jain, Sahil, Aydoğan, Murat Han, Mitra, Kenneth, Desai, Kush, Saran, Akanksha, Baraka, Kim

arXiv.org Artificial IntelligenceOct-30-2024

Agent learning from human interaction often relies on explicit signals, but implicit social cues, such as prosody in speech, could provide valuable information for more effective learning. This paper advocates for the integration of prosody as a teaching signal to enhance agent learning from human teachers. Through two exploratory studies--one examining voice feedback in an interactive reinforcement learning setup and the other analyzing restricted audio from human demonstrations in three Atari games--we demonstrate that prosody carries significant information about task dynamics. Our findings suggest that prosodic features, when coupled with explicit feedback, can enhance reinforcement learning outcomes. Moreover, we propose guidelines for prosody-sensitive algorithm design and discuss insights into teaching behavior. Our work underscores the potential of leveraging prosody as an implicit signal for more efficient agent learning, thus advancing human-agent interaction paradigms.

agent, prosody, utterance, (12 more...)

arXiv.org Artificial Intelligence

2410.23554

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
(7 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

SubjECTive-QA: Measuring Subjectivity in Earnings Call Transcripts' QA Through Six-Dimensional Feature Analysis

Pardawala, Huzaifa, Sukhani, Siddhant, Shah, Agam, Kejriwal, Veer, Pillai, Abhishek, Bhasin, Rohan, DiBiasio, Andrew, Mandapati, Tarun, Adha, Dhruv, Chava, Sudheer

arXiv.org Artificial IntelligenceOct-27-2024

Fact-checking is extensively studied in the context of misinformation and disinformation, addressing objective inaccuracies. However, a softer form of misinformation involves responses that are factually correct but lack certain features such as clarity and relevance. This challenge is prevalent in formal Question-Answer (QA) settings such as press conferences in finance, politics, sports, and other domains, where subjective answers can obscure transparency. Despite this, there is a lack of manually annotated datasets for subjective features across multiple dimensions. To address this gap, we introduce SubjECTive-QA, a human annotated dataset on Earnings Call Transcripts' (ECTs) QA sessions as the answers given by company representatives are often open to subjective interpretations and scrutiny. The dataset includes 49,446 annotations for long-form QA pairs across six features: Assertive, Cautious, Optimistic, Specific, Clear, and Relevant. These features are carefully selected to encompass the key attributes that reflect the tone of the answers provided during QA sessions across different domain. Our findings are that the best-performing Pre-trained Language Model (PLM), RoBERTa-base, has similar weighted F1 scores to Llama-3-70b-Chat on features with lower subjectivity, such as Relevant and Clear, with a mean difference of 2.17% in their weighted F1 scores. The models perform significantly better on features with higher subjectivity, such as Specific and Assertive, with a mean difference of 10.01% in their weighted F1 scores. Furthermore, testing SubjECTive-QA's generalizability using QAs from White House Press Briefings and Gaggles yields an average weighted F1 score of 65.97% using our best models for each feature, demonstrating broader applicability beyond the financial domain. SubjECTive-QA is publicly available under the CC BY 4.0 license

information retrieval, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2410.20651

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.05)
Asia > India > Maharashtra > Mumbai (0.05)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
(16 more...)

Genre:

Financial News (1.00)
Research Report > New Finding (0.87)

Industry:

Media > News (1.00)
Law (1.00)
Banking & Finance > Trading (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Improving Academic Skills Assessment with NLP and Ensemble Learning

Huang, Xinyi, Wu, Yingyi, Zhang, Danyang, Hu, Jiacheng, Long, Yujian

arXiv.org Artificial IntelligenceOct-13-2024

This study addresses the critical challenges of assessing foundational academic skills by leveraging advancements in natural language processing (NLP). Traditional assessment methods often struggle to provide timely and comprehensive feedback on key cognitive and linguistic aspects, such as coherence, syntax, and analytical reasoning. Our approach integrates multiple state-of-the-art NLP models, including BERT, RoBERTa, BART, DeBERTa, and T5, within an ensemble learning framework. These models are combined through stacking techniques using LightGBM and Ridge regression to enhance predictive accuracy. The methodology involves detailed data preprocessing, feature extraction, and pseudo-label learning to optimize model performance. By incorporating sophisticated NLP techniques and ensemble learning, this study significantly improves the accuracy and efficiency of assessments, offering a robust solution that surpasses traditional methods and opens new avenues for educational technology research focused on enhancing core academic competencies.

arxiv preprint arxiv, assessment, classification, (13 more...)

arXiv.org Artificial Intelligence

2409.19013

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
North America > United States > Texas > Collin County > Frisco (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Asia > China > Yunnan Province > Kunming (0.04)

Genre: Research Report (0.50)

Industry: Education (0.35)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)
(2 more...)

Add feedback

DataAgent: Evaluating Large Language Models' Ability to Answer Zero-Shot, Natural Language Queries

Mishra, Manit, Braham, Abderrahman, Marsom, Charles, Chung, Bryan, Griffin, Gavin, Sidnerlikar, Dakshesh, Sarin, Chatanya, Rajaram, Arjun

arXiv.org Artificial IntelligenceMar-29-2024

Conventional processes for analyzing datasets and extracting meaningful information are often time-consuming and laborious. Previous work has identified manual, repetitive coding and data collection as major obstacles that hinder data scientists from undertaking more nuanced labor and high-level projects. To combat this, we evaluated OpenAI's GPT-3.5 as a "Language Data Scientist" (LDS) that can extrapolate key findings, including correlations and basic information, from a given dataset. The model was tested on a diverse set of benchmark datasets to evaluate its performance across multiple standards, including data science code-generation based tasks involving libraries such as NumPy, Pandas, Scikit-Learn, and TensorFlow, and was broadly successful in correctly answering a given data science query related to the benchmark dataset. The LDS used various novel prompt engineering techniques to effectively answer a given question, including Chain-of-Thought reinforcement and SayCan prompt engineering. Our findings demonstrate great potential for leveraging Large Language Models for low-level, zero-shot data analysis.

benchmark dataset, dataset, query, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ICAIC60265.2024.10433803

2404.00188

Country:

North America > United States > California > Santa Clara County > Sunnyvale (0.05)
South America > Brazil > São Paulo (0.04)
North America > United States > Texas > Collin County > Frisco (0.04)
(12 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Protein Structure Prediction Approach Leveraging Transformer and CNN Integration

Zhou, Yanlin, Tan, Kai, Shen, Xinyu, He, Zheng, Zheng, Haotian

arXiv.org Artificial IntelligenceMar-8-2024

Proteins are essential for life, and their structure determines their function. The protein secondary structure is formed by the folding of the protein primary structure, and the protein tertiary structure is formed by the bending and folding of the secondary structure. Therefore, the study of protein secondary structure is very helpful to the overall understanding of protein structure. Although the accuracy of protein secondary structure prediction has continuously improved with the development of machine learning and deep learning, progress in the field of protein structure prediction, unfortunately, remains insufficient to meet the large demand for protein information. Therefore, based on the advantages of deep learning-based methods in feature extraction and learning ability, this paper adopts a two-dimensional fusion deep neural network model, DstruCCN, which uses Convolutional Neural Networks (CCN) and a supervised Transformer protein language model for single-sequence protein structure prediction. The training features of the two are combined to predict the protein Transformer binding site matrix, and then the three-dimensional structure is reconstructed using energy minimization.

prediction, protein, sequence, (14 more...)

arXiv.org Artificial Intelligence

2402.19095

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Texas > Collin County > Frisco (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.65)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

ClickSAM: Fine-tuning Segment Anything Model using click prompts for ultrasound image segmentation

Guo, Aimee, Fei, Gace, Pasupuletic, Hemanth, Wang, Jing

arXiv.org Artificial IntelligenceFeb-9-2024

The newly released Segment Anything Model (SAM) is a popular tool used in image processing due to its superior segmentation accuracy, variety of input prompts, training capabilities, and efficient model design. However, its current model is trained on a diverse dataset not tailored to medical images, particularly ultrasound images. Ultrasound images tend to have a lot of noise, making it difficult to segment out important structures. In this project, we developed ClickSAM, which fine-tunes the Segment Anything Model using click prompts for ultrasound images. ClickSAM has two stages of training: the first stage is trained on single-click prompts centered in the ground-truth contours, and the second stage focuses on improving the model performance through additional positive and negative click prompts. By comparing the first stage's predictions to the ground-truth masks, true positive, false positive, and false negative segments are calculated. Positive clicks are generated using the true positive and false negative segments, and negative clicks are generated using the false positive segments. The Centroidal Voronoi Tessellation algorithm is then employed to collect positive and negative click prompts in each segment that are used to enhance the model performance during the second stage of training. With click-train methods, ClickSAM exhibits superior performance compared to other existing models for ultrasound image segmentation.

click prompt, clicksam, segmentation, (11 more...)

arXiv.org Artificial Intelligence

2402.05902

Country:

North America > United States > Texas > Dallas County > Dallas (0.05)
North America > United States > Texas > Dallas County > Richardson (0.04)
North America > United States > Texas > Collin County > Frisco (0.04)

Genre: Research Report (0.51)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.51)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Filters

Collaborating Authors

Frisco

6d0f9c415e2d779c78f32b74668e9d02-Paper-Datasets_and_Benchmarks_Track.pdf

6d0f9c415e2d779c78f32b74668e9d02-Paper-Datasets_and_Benchmarks_Track.pdf

Comparative Analysis of FOLD-SE vs. FOLD-R++ in Binary Classification and XGBoost in Multi-Category Classification

DoorDash plans to test drone deliveries in San Francisco warehouse

Prosody as a Teaching Signal for Agent Learning: Exploratory Studies and Algorithmic Implications

SubjECTive-QA: Measuring Subjectivity in Earnings Call Transcripts' QA Through Six-Dimensional Feature Analysis

Improving Academic Skills Assessment with NLP and Ensemble Learning

DataAgent: Evaluating Large Language Models' Ability to Answer Zero-Shot, Natural Language Queries

A Protein Structure Prediction Approach Leveraging Transformer and CNN Integration

ClickSAM: Fine-tuning Segment Anything Model using click prompts for ultrasound image segmentation